Integration of population-level data sources into an individual-level clinical prediction model for dengue virus test positivity

The differentiation of dengue virus (DENV) infection, a major cause of acute febrile illness in tropical regions, from other etiologies, may help prioritize laboratory testing and limit the inappropriate use of antibiotics. While traditional clinical prediction models focus on individual patient-level parameters, we hypothesize that for infectious diseases, population-level data sources may improve predictive ability. To create a clinical prediction model that integrates patient-extrinsic data for identifying DENV among febrile patients presenting to a hospital in Thailand, we fit random forest classifiers combining clinical data with climate and population-level epidemiologic data. In cross-validation, compared to a parsimonious model with the top clinical predictors, a model with the addition of climate data, reconstructed susceptibility estimates, force of infection estimates, and a recent case clustering metric significantly improved model performance.


INTRODUCTION
Acute febrile illness (AFI) is a common reason for seeking health care in low- and middle-income countries (LMICs) (1). Determination of AFI etiology is often limited by diagnostic testing capacity, given the wide spectrum of potential infectious agents. Inappropriate use of testing and treatment resources may result in poor outcomes, such as the high case fatality rates seen in admitted AFI patients (5 to 20%) (2)(3)(4)(5)(6)(7). Dengue virus (DENV) is a major cause of AFI in LMICs, accounting for an estimated 390 million infections, 96 million illnesses, 2 million severe cases, and 21,000 deaths per year (8). The differentiation between dengue and other common causes of febrile illness is important to avoid misdiagnosis, which can lead to delays in initiation of effective treatment and inappropriate use of antibiotics (9). Because of the lack of pathognomonic clinical features that reliably distinguish dengue from other febrile illnesses, virological or serological laboratory confirmation is required for definitive diagnosis. While multiplexed tests that can quickly identify the causative pathogen are ideal, they are often unavailable in LMICs due to cost and insufficient laboratory infrastructure. Even rapid, point-of-care tests may be cost-prohibitive in LMICs (10). Accurate and cost-effective tools to better determine etiology of fever at the point of care are greatly needed to guide the use of diagnostics and therapeutics, conserving scarce health care resources.
Clinical decision support systems (CDSSs) incorporating prediction models may offer a solution to better management of infectious diseases in low-resource settings. CDSSs, such as applications on smartphone devices, can gather data from a range of online sources and implement sophisticated clinical prediction models that would be impractical for clinicians to calculate manually. CDSSs have proven effective at improving therapeutic management and reducing unnecessary diagnostic tests in both high-income countries (11) and LMICs (12)(13)(14). In Bangladesh, an electronic CDSS was shown to improve clinical dehydration assessment and World Health Organization (WHO) diarrhea guideline adherence, as well as reduce nonindicated antibiotic use in children under five by 29% (12). Traditional predictive models generally incorporate clinical information that is obtained solely from the presenting patient. Predictive models that incorporate additional information, such as seasonal or climate predictors, location-specific historical prevalence, and characteristics of prior patients, have been shown to increase diagnostic accuracy and limit inappropriate antibiotic use (14)(15)(16).
The underlying probability of being infected by DENV varies by both space and time. The risk of DENV transmission depends on conditions that promote mosquito breeding, including when temperatures are warmer (17)(18)(19), and the risk of infection is influenced by local population immunity, as large outbreak years are typically followed by periods of low transmission (20)(21)(22). Because most DENV transmission is highly focal, population susceptibility profiles can be spatially heterogeneous at any time (21)(23)(24)(25). Thus, our objective is to develop an improved clinical prediction model for dengue by integrating temporal and spatial (location-specific) parameters including climate data, clustering of recent cases, and population susceptibility estimates derived from seroprevalence or hospital data in the surrounding community. We demonstrate the potential for integrating location- and population-specific data sources into clinical prediction models. This approach has the potential to inform the development of improved tools to aid clinicians in diagnostic and therapeutic decision making for patients presenting with suspected dengue.

RESULTS
Of the 12,833 participants in the clinical dataset, 5731 (45%) were confirmed to have DENV infection by polymerase chain reaction (PCR). DENV-positive patients were significantly younger (18 versus 22 years, P < 0.001; Table 1). Nearly all cases (97.8%) came from the 11 districts within Kamphaeng Phet province (Table 1). There was no significant difference between the probability of testing positive for males and females (P = 0.07); no other genders were reported. The probability of testing positive differed substantially by age, ranging from 26% for those <4 years to 58% for those 15 to 19 years of age (Table 2). Patients between the ages of 10 and 14 years, 15 and 19 years, and 5 and 9 years comprised the largest proportion of cases (23, 18, and 16%, respectively), while older patients comprised a much smaller proportion of cases (30 to 34 years, 5%; 35 to 39 years, 4%).
We found significant differences in many of the clinical symptoms reported by DENV-positive and DENV-negative patients. Table 1 lists the top discriminative symptoms between the groups based on random forest and logistic regression. The most common symptom reported was fever, followed by headache. In univariate analysis, we found that individuals with fever, chills, malaise, retro-orbital pain, nausea, headache, and vomiting were significantly more likely to test positive for DENV, and individuals with cough, rhinitis, and pharyngitis were significantly less likely to test positive for DENV (table S2).
When we examined the proportion of positive cases to total cases by year and month, we found that both total and positive cases significantly increased in the months between June and September (P < 0.001, χ² test). The proportion of positive cases differed substantially by year (P < 0.001, χ² test), ranging from 19% in 2016 to 90% in 2017. The period of lowest test positivity in 2016 and 2017 coincided with the Zika virus epidemic in the country (Fig. 1).

Model performance evaluation using only clinical predictors and parsimonious variable selection
We first assessed performance using a traditional clinical prediction model, which includes only the presenting patient's information. A random forest classifier using all 23 clinical features resulted in an average area under the receiver operating characteristic curve (AUC) of 69.5% [95% confidence interval (CI): 67.5 to 71.5] from repeated cross-validation. To determine the optimal number of variables for a parsimonious prediction model, we used a random forest classifier to analyze the improvement in model performance with each additional clinical variable included. Figure 2 shows the improvement in AUC with each additional variable using two random forest classifiers (one with all other predictors and the other using only clinical data) as well as a logistic regression model using only clinical variables. Performance leveled off with three clinical variables: age, cough, and nausea. Using a model with only these three predictors, we achieve an average AUC of 67.0% (95% CI: 65.0 to 69.1). Table S3 shows the relative frequency of these variables by age group. We demonstrate the direction and magnitude of the effect of the top predictors by generating partial dependence plots from random forest and logistic regression classifiers (fig. S1).

Addition of climate data to the clinical parameter model resulted in an improved AUC
Next, we fit models using climate data. To appropriately adjust lag time for each climate variable, we fit a random forest classifier using only climate variables and assessed variable importance by AUC. A random forest model with recent and lagged aggregated climate data without clinical predictors resulted in an AUC of 58.7% (95% CI: 56.5 to 60.9). We found that the best performing climate variables were visibility, relative humidity, wind speed, and precipitation, all lagged by 3 months. We examined the relationship between the top two performing climate predictors (visibility and relative humidity) and the proportion of positive cases each month (Fig. 3). For each climate predictor, table S4 lists the odds ratio and compares the mean of each predictor by DENV-positive or DENV-negative groups. When combined with the top three clinical variables, climate data performed similarly (AUC of 67.2%, 95% CI: 65.2 to 69.3) to clinical data alone (AUC of 67.0%, 95% CI: 65.0 to 69.1) (median P = 0.60, 2% P < 0.05). However, when climate data were combined with all other predictors, model performance improved from an AUC of 68.4% (95% CI: 66.4 to 70.4) to an AUC of 70.0% (95% CI: 67.9 to 71.9; median P = 0.07, 45% P < 0.05). To assess whether integrating more location-specific climate data would improve performance, we fit models using climate data from each case's home district; however, model performance did not noticeably change.
Table 2 shows the AUCs for the clinical base model, compared to the base model plus the inclusion of additional data sources.

Addition of RS estimates to the clinical parameter model resulted in an improved AUC
Using historical hospital case data from the province, we obtained estimates of the size of the susceptible population by age for each year (across all subdistricts in the province). In our predictive model, we used the prior year's reconstructed susceptibility (RS) estimates. Using logistic regression, we found that secondary RS estimates performed better than primary RS estimates [60.7% (95% CI: 58.6 to 62.9) versus 52.3% (95% CI: 50.1 to 54.6)]. When added to a random forest classifier with climate and/or clinical predictors, the inclusion of RS estimates consistently resulted in higher AUCs (Table 2). When added to the top three clinical parameters alone, RS estimates nonsignificantly improved model performance from an AUC of 67.0% (95% CI: 65.0 to 69.1) to an AUC of 67.5% (95% CI: 65.4 to 69.5) (median P = 0.40, 9% P < 0.05). Last, a model including all predictors resulted in higher AUCs than a model without RS (median P = 0.09, 32% P < 0.05).

Addition of subdistrict-specific FoI estimates to the clinical parameter model resulted in an improved AUC
We incorporated force of infection (FoI) estimates for each age by subdistrict using data from a local cohort study. This assumes that the underlying differences in the FoI are constant in time. Using logistic regression, FoI estimates had an AUC of 57.0% (95% CI: 54.8 to 59.2). The inclusion of FoI estimates led to increases in AUC when added to the top clinical predictors, when added to clinical predictors and climate data, and when added to clinical predictors, climate predictors, and RS estimates (Table 2). When included with all other predictors, a model with FoI estimates nonsignificantly improved performance compared to a model without FoI estimates (median P = 0.30, 23% P < 0.05).

Addition of the case clustering metric to the clinical parameter model resulted in an improved AUC
Last, we fit a model that assessed for clustering of recent cases based on prior patients presenting to the Kamphaeng Phet Provincial Hospital (KPPH). Using logistic regression, we found that the case clustering metric (the number of positive cases in the district over the past 30 days divided by the total number of cases from that district in the study period) had an AUC of 56.4% (95% CI: 54.2 to 58.6). We found that the use of the case clustering metric consistently improved model performance. Stratifying by the finer spatial unit of district consistently outperformed models with prior patients stratified by province. When added to the top performing clinical variables, model performance significantly improved (median P = 0.02, 60% P < 0.05). When compared to a model with all predictors except cluster of recent cases, the inclusion of this predictor significantly improved model performance (median P = 0.007, 79% P < 0.05).
Last, when comparing a model including all predictors with a model including only the top clinical predictors, model performance improved from an AUC of 67.0% (95% CI: 65.0 to 69.1) to an AUC of 70.0% (95% CI: 67.9 to 71.9) (median P = 0.006, 87% P < 0.05). Our model had a sensitivity of 55.3%, a specificity of 70.2%, a positive predictive value (PPV) of 60.0%, and a negative predictive value (NPV) of 66.1%.
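The relationship among these four quantities can be illustrated with Bayes' rule. The following is a minimal Python sketch (the study itself used R), under the assumption that prevalence in the evaluation data equals the overall test positivity reported above (5731 of 12,833); the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumption: prevalence is the overall PCR positivity in the clinical dataset.
prevalence = 5731 / 12833
ppv, npv = predictive_values(0.553, 0.702, prevalence)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under this assumption, the implied predictive values fall close to the reported PPV and NPV. Both would shift in a setting with a different test positivity even if sensitivity and specificity were unchanged.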

DISCUSSION
Insufficient diagnostic testing capacity in LMICs necessitates innovative approaches to support clinical decision-making. Here, we present a predictive model for DENV infection that integrates multiple sources of information both intrinsic and extrinsic to the patient, including climate data, clinical data, seroprevalence-based susceptibility estimates, and historical information from prior patients, which results in improved predictive performance. While the model with all predictors included did significantly outperform the base parsimonious model with only clinical predictors (median P = 0.006, 87% P < 0.05), whether the additional 3.0% improvement in AUC is clinically useful may be case and clinician dependent. Certain components of our model require data from sero-surveillance, which may not be accessible in all communities. However, simplifying the model by including only the top clinical predictors and the case cluster metric alone results in an AUC decrease of only 1.3%. These metrics are more readily obtainable and, notably, do not necessitate laboratory resources. Nevertheless, we believe that the results demonstrate a proof of concept that seroprevalence-based susceptibility estimates and climate data can be used to improve predictive performance and may be useful to augment prediction in other communicable diseases.
There is a lack of information on the deficiency of testing capacity both in Thailand and globally in LMICs. Accurately quantifying the true extent of diagnostic testing deficiencies is challenging as LMICs often lack robust national surveillance systems. In a Brazilian study between the years 2010 and 2019, where every suspected case of dengue was recorded in a national surveillance database, only 11% of the 350,000 cases of suspected dengue infection were ultimately tested (26). If we extrapolate the results from Brazil to other LMICs, as much as 90% of dengue-like illness may go undiagnosed, highlighting the need for tools to bridge the diagnostic gap.
DENV transmission can exhibit temporal and geographical heterogeneity even at fine spatial scales, with variations observed even among neighboring villages (31)(32)(33). We thus used patient-extrinsic (location-specific) data sources in our models. The improvement in AUC with finer spatial units suggests that population-level spatial heterogeneity exists at the district level and can be applied to individual-level clinical prediction. We expect further improvements in predictive performance if finer-scale location data, such as at the community level, became routinely available for cases. The improvement with the use of either the province-level or district-level case clustering metric highlights the utility of temporal predictors in DENV clinical prediction models.
Spatial heterogeneity in dengue incidence may be explained in part by micro-climates, which can modify transmission dynamics at small scales. For example, within urban heat islands, temperature variations of up to 10°C compared to other city areas may create conditions more conducive to dengue transmission in cooler temperatures (34). We collected all climate data from the provincial weather station in Kamphaeng Phet. We attempted to integrate climate data at a more localized level; however, several subdistricts do not have weather stations, or their weather station data were incomplete. When fitting models using data from all districts in Kamphaeng Phet, however, we found similar results.
Transmission of DENV occurs in a seasonal pattern, and several climate variables have been found to increase DENV transmission and/or vector populations (17-19, 35, 36). While prior studies have demonstrated associations between dengue incidence and climate variables like average precipitation, relative humidity, temperature, and wind speed, with varying lag times between 0 and 3 months (37)(38)(39)(40), our prediction-based analytic framework is not intended to examine causal or associative relationships between climate variables and the outcome of dengue incidence. Our findings suggest that site-specific climate variables aid in site-specific models to predict DENV infection. While visibility has not been found to be associated with dengue incidence, we found that it was the most important climate predictor. It is plausible that visibility serves as a proxy indicator for an underlying factor that affects dengue incidence, such as air pollution, which has been postulated as a contributing factor (41,42). Appropriate lag times would need to be tuned to different sites. For use in a clinical decision support tool, the most recent climate variables could be gathered from online weather sources based on smartphone-based detection of GPS location. An optimal utilization of this model would be through a smartphone application, as there is a scarcity of electronic medical record availability in LMICs. This would necessitate access to a smartphone device and internet connection; however, clinicians and frontline health care workers increasingly have access to smartphone devices, even in remote areas of LMICs (43).
There were significant differences between DENV-positive and DENV-negative patients in 16 of the 22 clinical symptoms collected on presentation, consistent with features known to distinguish dengue from other illnesses (44,45). To minimize clinician input requirements (46), we used random forest classifiers to identify the optimal variables to derive a parsimonious model. We were able to achieve near-optimal performance with only three clinical variables: age, nausea, and cough. Numerous multivariable models based on clinical presentation have been developed to identify dengue infection in patients with AFI. In a review of published logistic regression prediction models, rash and/or petechiae was the most frequently identified predictor (four of seven models) to discriminate between DENV-positive and DENV-negative patients. When evaluated, the absence of cough was found to be a predictor in 33% of models. Nausea, which was evaluated in four logistic regression models, did not achieve significance in any model. Our results differ from those found in many logistic regression models and align with more intricate models for DENV diagnosis. Models using deep neural networks (47), random forest (48), and gradient boosting (XGBoost) (49) noted that age was the best clinical discriminative predictor. These models did not include cough or nausea as variables for assessment. We found that the input of as little as one clinical variable (age), along with the other predictors, can provide useful clinical information (AUC: 67.9%, 95% CI: 65.6 to 70.0), especially in cases where other symptoms cannot be easily obtained, such as in nonverbal or comatose patients.
We show that RS estimates, which reflect the transmission dynamics of disease and the susceptible proportion of a population, improve individual-level clinical prediction on their own. However, there are several factors that make the use of RS estimates problematic, and we favor the use of other location-specific predictors. First, RS estimates may be more difficult to obtain across different settings. Moreover, RS estimates may not serve as a reliable indicator of protection against DENV, as they represent a mixed concept: immunity may reflect protection due to herd immunity or may indicate increased risk of dengue infection, as higher levels of immunity may reflect higher viral circulation of the multiple DENV serotypes with substantial immunologic cross-reactivity. Last, RS estimates are themselves derived from a model and so should be considered with caution.
Our study has several limitations. First, our model was constructed using data from a single center and testing was limited to patients suspected of having dengue infection, potentially hindering the model's generalizability to a broader population. Similarly, as there was inherent heuristic bias in the patients selected for testing, the clinical components of the model reflect this specific population, meaning that other important predictors of dengue infection, such as fever, were already included in the clinician's decision-making. Our results were limited to internal cross-validation; further studies for external validation are necessary. Last, our assessment of the use of spatial dynamics in DENV transmission was limited as cases were only matched to each district rather than subdistrict or village. In the future, models that integrate cases based on a finer spatial scale may better assess the role of a patient's residing location in prediction. Despite these limitations, we demonstrate that predictive models that include patient-extrinsic, location-specific elements can improve prediction and allow for parsimonious models that minimize clinician input; such elements should be considered in future work on clinical prediction and decision support tools.

Location
Kamphaeng Phet is a province in north-central Thailand, located 350 km north of Bangkok, with a population of 725,000 people in a mostly rural and semirural setting (33,50). We used data collected from patients presenting to KPPH, a large tertiary care hospital in the province, to identify clinical predictors that could discriminate between DENV-infected and uninfected patients (33,50).

Hospital-based suspected dengue patient data
We used data on over 12,000 patients presenting to KPPH with suspected dengue between August 2007 and December 2021. The data were collected by the U.S. Army Medical Directorate-Armed Forces Research Institute of Medical Sciences. As DENV testing in this hospital is provided free of charge and this is a highly DENV-endemic region, individuals will be tested for DENV infection if there is any suspicion of dengue, however minor. This provides an excellent test case to understand whether individual or location-specific risk factors are associated with testing positive for DENV.
For all suspected dengue cases, we used demographic and clinical information including patient age, sex, home village, admission diagnosis, date of admission, presenting symptoms, and DENV PCR status. The following signs and symptoms were recorded as binary variables: fever, chills, malaise, rhinitis, rash, sore throat, seizure, cough, nuchal rigidity, eye pain, nausea, headaches, vomiting, joint pain, abnormal movements, anorexia, myalgias, diarrhea, dark urine, abdominal pain, and bleeding. DENV infection was evaluated using reverse transcription PCR. We recorded the residence of each patient to the district (Amphoe) level using detailed base maps of the region.

Climate variables using NOAA data
Climate and seasonal factors such as temperature, precipitation, and humidity influence vector populations and DENV transmission (17)(18)(19)(35). We used the R package GSODR to gather climate data from the centralmost National Oceanic and Atmospheric Administration (NOAA) weather station in the province of Kamphaeng Phet, Thailand, which included mean daily temperature, precipitation, dew point, relative humidity, sea level pressure, visibility, and wind speed. To better reflect seasonal trends, we aggregated data in 14-day increments before the day of the DENV infection prediction. As climate can alter vector feeding behavior (19,36), we used aggregated climate predictors in the 2 weeks before case presentation. In addition, climate in the months before outbreaks can influence both vector population dynamics and viral replication (19,35). To determine the appropriate lag time for each climate variable, we constructed a random forest classifier with climate variables lagged at 1, 2, and 3 months. Using the R package "vip," we calculated the importance of each variable by AUC and used the best performing lag time for each climate variable.
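The aggregation and lagging described above can be sketched as follows. This is an illustrative Python example (the study used R and GSODR); the function name `lagged_window_mean`, the dictionary input, and the approximation of one month as 30 days are assumptions made for illustration, not details taken from the study.

```python
from datetime import date, timedelta
from statistics import mean


def lagged_window_mean(daily_values, prediction_date, lag_months=0,
                       window_days=14, days_per_month=30):
    """Mean of a daily climate series over the `window_days` days that end
    `lag_months` months before `prediction_date` (that end day is excluded)."""
    window_end = prediction_date - timedelta(days=lag_months * days_per_month)
    days = [window_end - timedelta(days=k) for k in range(1, window_days + 1)]
    observed = [daily_values[d] for d in days if d in daily_values]
    return mean(observed) if observed else None  # None if nothing was recorded


# Toy daily series (hypothetical values): the value equals the day index.
start = date(2020, 1, 1)
temperature = {start + timedelta(days=i): float(i) for i in range(200)}
prediction_day = start + timedelta(days=150)

recent = lagged_window_mean(temperature, prediction_day)                 # prior 2 weeks
lagged = lagged_window_mean(temperature, prediction_day, lag_months=3)   # 3-month lag
print(recent, lagged)
```

Each climate variable would yield one such feature per candidate lag, and the lag with the highest variable importance would be retained.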

Estimates of temporal changes in population susceptibility using national surveillance system data
We estimate population susceptibility using age-specific case data from the national surveillance system, restricted to Kamphaeng Phet province. We note that most of the cases in this dataset are suspected DENV cases (i.e., without confirmatory testing). We have previously developed models to explicitly link underlying infection risks to the observed distribution of cases by age and year to estimate annual age-specific FoI in provinces of Thailand up until 2017 (51). The estimates can be used to reconstruct the buildup of immunity in populations by age. Here, we reconstruct population susceptibilities in Kamphaeng Phet going into each year, using only data before that year, to mimic real-world use, where only prior years' data are available. As dengue disease severity is greatest for secondary infections, we consider two alternative formulations to define susceptibility to disease. First, we consider complete susceptibility, where we use the estimates of the proportion of individuals of an age group and year that are completely seronaive. Second, we consider the proportion of individuals of an age group and year that have experienced one prior infection and are therefore at increased risk of severe disease.
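To make the two susceptibility definitions concrete, the following Python sketch uses a simple catalytic model. It is not the model of reference (51): it assumes four serotypes with an equal per-serotype annual FoI and no cross-protection, and the function name and input layout are hypothetical.

```python
import math


def reconstructed_susceptibility(foi_by_year, year, age, n_serotypes=4):
    """Return (P(seronaive), P(exactly one prior infection)) for a person
    aged `age` at the start of `year`.

    `foi_by_year` maps calendar year -> per-serotype annual FoI (assumed input).
    Only years before `year` are used, mirroring prospective use.
    Years missing from the mapping are treated as zero FoI in this sketch.
    """
    cumulative = sum(foi_by_year.get(y, 0.0) for y in range(year - age, year))
    escape_one = math.exp(-cumulative)  # probability of escaping one serotype
    naive = escape_one ** n_serotypes
    one_prior = (n_serotypes * (1.0 - escape_one)
                 * escape_one ** (n_serotypes - 1))
    return naive, one_prior


# Hypothetical constant per-serotype FoI of 0.02 per year.
foi = {y: 0.02 for y in range(1990, 2021)}
naive_5, one_prior_5 = reconstructed_susceptibility(foi, 2021, 5)
naive_15, one_prior_15 = reconstructed_susceptibility(foi, 2021, 15)
print(round(naive_5, 3), round(naive_15, 3))
```

In this simplified setting, the seronaive proportion declines with age, while the proportion with exactly one prior infection first rises and then falls as further infections accumulate.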

Estimates of spatial differences in the underlying FoI using seroprevalence data from a cohort study
To estimate underlying spatial differences in the FoI in the province, we make use of a DENV cohort study in the region, where healthy individuals of all ages from throughout Kamphaeng Phet province have provided blood (52). The cohort is ongoing. We use data from samples collected during baseline blood draws that occurred between 2015 and 2021. Hemagglutination inhibition assays were used to characterize immunity to the four DENV serotypes; individuals were considered seropositive if they had a titer of 10 or greater to any serotype. We have previously used these seroprevalence data to estimate the underlying mean FoI and the proportion of the population that are susceptible to DENV infection in different subdistricts in the province (53). Here, we use these subdistrict-specific estimates to characterize underlying heterogeneity in the FoI in the province. Because the cohort data come from 2015 to 2021, whereas much of the hospital case data predates the cohort, we assume that the FoI is stable in time within any location.

Spatial clustering of positive cases based on prior patients presenting to the hospital
The local clustering of positive cases from a single area may signal local ongoing transmission. To assess for a temporal and spatial relationship between cases, we stratified cases that presented to KPPH by both district and province. For each stratification, we computed the number of positive cases from the patient's area in the 30 days before presentation, divided by the total number of cases from that area over the study period.
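The metric can be sketched in a few lines of Python (the study used R). The tuple layout of `cases`, the area labels, and the choice to exclude cases on the presentation day itself are assumptions made for illustration.

```python
from datetime import date, timedelta


def case_cluster_metric(cases, area, presentation_date, window_days=30):
    """Positive cases from `area` in the `window_days` before presentation,
    divided by all cases from `area` over the whole study period.

    `cases` is a list of (area, presentation date, is_positive) tuples.
    """
    window_start = presentation_date - timedelta(days=window_days)
    recent_positive = sum(
        1 for a, d, positive in cases
        if a == area and positive and window_start <= d < presentation_date
    )
    total_in_area = sum(1 for a, _, _ in cases if a == area)
    return recent_positive / total_in_area if total_in_area else 0.0


# Hypothetical prior patients.
cases = [
    ("district_A", date(2019, 6, 1), True),
    ("district_A", date(2019, 6, 20), True),
    ("district_A", date(2019, 1, 5), False),
    ("district_A", date(2018, 7, 9), True),
    ("district_B", date(2019, 6, 15), True),
]
metric = case_cluster_metric(cases, "district_A", date(2019, 6, 25))
print(metric)
```

Here two of the four district_A cases are positive and fall within the 30-day window, so the metric is 0.5. The same function applies to the province-level stratification by passing province labels as the area.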

Statistical analysis and modeling
We fit random forest classifiers to predict DENV infection. Random forests are a machine learning method that constructs a multitude of decision trees and averages over them to obtain a prediction that is robust to nonlinearities and interactions between covariates; the method has been widely applied in the biomedical sciences for both classification and regression (54,55).
We initially identified the subset of clinical symptoms that were most informative of true infection status. To do this, we fit random forest models using only clinical predictors and then used the R package "vip" to calculate the variable importance by AUC for each clinical variable. We determined a variable's importance by calculating the change in AUC after permuting, or randomly shuffling, each predictor. To attempt to achieve the most parsimonious prediction rule (i.e., the best predictive model requiring the fewest variables to be input by clinicians), we fit random forest and logistic regression models using training data with consecutively increasing clinical predictor set sizes based on the order of importance and applied these to the test set to determine the smallest model with the best performance. Next, we incorporated the patient-extrinsic factors. We fit each random forest classifier using 1000 decision trees and used the default number of variables to be randomly considered at each node split (mtry = square root of the number of candidate variables). In the construction of our predictive models, we input climate predictors, age, susceptibility estimates, and the case clustering metric as continuous variables, and we input the optimized clinical predictors as binary (presence or absence) categorical variables. Missing predictor data were imputed using the R package "randomForest." We used logistic regression for each predictor to create a univariate comparison between DENV-positive and DENV-negative cases. We fit multiple logistic regression models to compare the performance of parsimonious models with a random forest classifier using the same number of predictors.
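Permutation-based variable importance by AUC can be illustrated with the following Python sketch. It is a simplified stand-in for what the "vip" package computes, not its implementation; the scoring function `predict`, the toy rows, and all function names are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, rows, labels, feature,
                           n_repeats=20, seed=0):
    """Mean drop in AUC after randomly shuffling one feature across rows."""
    rng = random.Random(seed)
    baseline = auc(labels, [predict(r) for r in rows])
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled)]
        drops.append(baseline - auc(labels, [predict(r) for r in permuted]))
    return sum(drops) / n_repeats


# Toy data: absence of cough marks a positive case; "noise" is irrelevant.
rows = [{"cough": i % 2, "noise": i % 3} for i in range(12)]
labels = [1 - r["cough"] for r in rows]


def predict(row):
    """Hypothetical fitted model that scores on cough only."""
    return 1.0 - row["cough"]


cough_importance = permutation_importance(predict, rows, labels, "cough")
noise_importance = permutation_importance(predict, rows, labels, "noise")
print(cough_importance, noise_importance)
```

Shuffling a predictor the model relies on lowers the AUC, whereas shuffling an unused predictor leaves it unchanged. Ranking predictors by this drop gives the order in which variables would be added when building models of increasing size.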
To assess predictive performance for both random forest and logistic regression models, we used repeated cross-validation using 80% training/20% testing splits with 100 iterations. No testing data were used when training the model. In each iteration, predictions on the test set were produced and corresponding measures of performance were obtained. To determine overall model performance, we averaged the AUC and CIs for the 100 iterations. To determine statistical significance between models, we used a bootstrap method over 100 iterations, which involves resampling the data with replacement multiple times, creating bootstrap samples. For each bootstrap sample, receiver operating characteristic (ROC) curves were generated and the differences between the curves were computed. All analyses were completed using R version 4.2.0, and model development/validation was completed in accordance with the TRIPOD checklist (table S1).
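A bootstrap comparison of two models' AUCs on one test split can be sketched as below, together with a summary of P values across splits in the form reported in the Results (median P and the share of P < 0.05). This Python example rests on assumptions: the text does not specify the test statistic, so the sketch divides the observed AUC difference by the bootstrap standard deviation and uses a normal approximation; all names and the simulated scores are hypothetical.

```python
import math
import random
from statistics import median, stdev


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=100, seed=0):
    """Observed AUC difference (A minus B) and a two-sided P value."""
    rng = random.Random(seed)
    n = len(labels)
    observed = auc(labels, scores_a) - auc(labels, scores_b)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # both classes are needed to define an AUC
            a = [scores_a[i] for i in idx]
            b = [scores_b[i] for i in idx]
            diffs.append(auc(y, a) - auc(y, b))
    z = observed / stdev(diffs)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # normal approximation
    return observed, p_value


def summarize_p_values(p_values, alpha=0.05):
    """Median P and the fraction of iterations with P below `alpha`."""
    return median(p_values), sum(p < alpha for p in p_values) / len(p_values)


# Simulated test-set scores: an informative model versus an uninformative one.
rng = random.Random(1)
labels = [i % 2 for i in range(100)]
full_model = [2.0 * y + rng.gauss(0, 1) for y in labels]
base_model = [rng.gauss(0, 1) for _ in labels]
difference, p_value = bootstrap_auc_difference(labels, full_model, base_model)
print(difference, p_value)
```

Repeating this comparison on each of the 100 train/test splits and passing the resulting P values to `summarize_p_values` would give a summary of the kind reported in the Results.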

Ethical considerations
This study was approved by the institutional review boards of the Thai Ministry of Public Health and Walter Reed Army Institute of Research (no. 2119) and the University of Utah (IRB_00150106).

Supplementary Materials
This PDF file includes: Fig. S1 and tables S1 to S4

Fig. 1. DENV cases at KPPH, Thailand, 2007-2021. The number of DENV cases (green) over total cases (blue) as proportion of AFI cases by year (A) and month (C) and the percentage of positive cases by year (B) and month (D) over the study period. A map of Kamphaeng Phet Province and its 11 districts. Colors indicate the number of positive cases (E) and the annual case rate per 100,000 persons (F) within each district between 2007 and 2021.

Fig. 2. Average AUC and 95% CIs from cross-validation (100 iterations) for random forest and logistic regression models. The red line indicates a random forest (RF) model with all other predictors (climate, reconstructed susceptibility estimates, FoI estimates, and prior patients) included. The green line indicates an RF model that includes only clinical predictors. The blue line indicates a logistic regression (LR) model with only clinical predictors included. The dotted lines indicate CIs.

Table 2. The AUCs and CIs by base model, compared to base model plus inclusion of additional data sources. "Clinical" indicates the inclusion of the top three clinical predictors, "climate" indicates the inclusion of climate predictors, "RS" indicates the inclusion of reconstructed susceptibility estimates derived using national surveillance data, "FoI" indicates the inclusion of force of infection estimates derived using cohort data, and "cluster" indicates the recent case cluster metric.